Data processing system

ABSTRACT

A data processing system includes a data processing unit which processes data acquired and a plurality of data retaining units which store databases used to process the data. Each of the plurality of data retaining units stores a primary database in common and stores the respective shares of a secondary database. The data processing system includes at least one more data retaining unit which can store the primary database and the respective shares of the secondary database.

TECHNICAL FIELD

The present invention relates to a data processing technique, particularly to a technique for operating multiple databases.

BACKGROUND ART

Due to improved Internet infrastructure and the widespread use of communication terminals, such as cellular phone terminals, personal computers, and VoIP (Voice over Internet Protocol) phone sets, the number of Internet users is now exploding. Under such circumstances, security problems such as computer viruses, hacking and spam mail have become apparent, requiring appropriate techniques for communication control.

The Internet has enabled easy access to a vast amount of information. On the other hand, harmful information is proliferating thereon, and regulation of its originators is not keeping up with the proliferation. To provide an environment where everyone can use the Internet safely and effectively, an appropriate technique for controlling access to harmful content is required.

For example, an access control technique has been proposed in which databases are prepared containing lists of sites to which access is permitted or prohibited, forbidden keywords or useful keywords, so as to control access to external information via the Internet with reference to such databases (see Patent Document 1, for example).

[Patent Document 1] Japanese Patent Application Laid-open No. 2001-282797.

DISCLOSURE OF INVENTION

Problem to be Solved by the Invention

If the number of users increases in a system where access control is performed as described in Patent Document 1, the size of the databases may increase enormously and may exceed the available storage capacity. In such a case, replacing the entire system is inefficient as it incurs unnecessary costs. It is also inefficient to prepare another system to be operated in the event that the operation of the main system has to be halted, such as when updating databases; the larger the system, the less efficient it will be.

The present invention has been made in view of such a situation, and a general purpose thereof is to provide a technique for operating multiple databases appropriately.

Means for Solving the Problem

One aspect of the present invention relates to a data processing system. The data processing system comprises: a data processing unit which processes data acquired; and a plurality of data retaining units which store databases used to process the data, wherein: each of the plurality of data retaining units stores a primary database in common and stores the respective shares of a secondary database; and the data processing system further comprises at least one more data retaining unit which can store the primary database and the respective shares of the secondary database.

In this way, when databases are shared and stored by multiple data retaining units, which are operated cooperatively, the size of each of the data retaining units can be reduced. This can consequently reduce the cost and man-hours required to expand the system as the amount of data grows. Also, there is no need to duplicate the entirety of a large data processing apparatus to prepare a standby unit for database updating or the like. Instead, since only one or more of the comparatively small data retaining units need be provided as extra standby units, the system configuration can be simplified, thereby reducing initial investment or operational costs.

The primary database may contain data for determining which share of the secondary database is to be used to process the data. For example, the primary database may contain data for user authentication, while the secondary database may contain information on data processing for each user, etc.

The data processing system may further comprise an operation management unit, which manages the operating state of the plurality of data retaining units. The operation management unit may operate as many data retaining units as are required to share and store the secondary database, and may place the remaining data retaining unit on standby. When a database retained in the data retaining units is updated, the operation management unit may store, in a data retaining unit on standby, updated data of the database retained in any one of the data retaining units in operation, and may subsequently stop the operation of the data retaining unit storing the database before update and place the data retaining unit storing the updated database in operation. Thus, databases can be updated without halting the operation.
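The update procedure above can be illustrated with a minimal Python sketch. The class `RetainingUnit`, the function `rolling_update`, and the idea of repeating the swap once per share are assumptions made for illustration only; the text itself describes just a single swap between one operating unit and the standby unit.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RetainingUnit:
    # Hypothetical model of one data retaining unit.
    name: str
    share: Optional[int] = None  # which share of the secondary database it holds
    version: int = 0             # database version currently loaded
    active: bool = False         # True while the unit is in operation


def rolling_update(units: List[RetainingUnit], new_version: int) -> None:
    """Update every share in turn, using the one standby unit as a staging area."""
    standby = next(u for u in units if not u.active)
    for old in [u for u in units if u.active]:
        # Store the updated copy of this share in the unit on standby.
        standby.share, standby.version = old.share, new_version
        # Place the updated unit in operation and stop the outdated one.
        standby.active, old.active = True, False
        old.share = None
        # The stopped unit serves as the standby for the next share.
        standby = old
```

At every step the same number of units stays in operation, which is how the sketch reflects updating without halting the operation.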

When detecting that a data retaining unit in operation has become inoperable, the operation management unit may store the database retained by that data retaining unit in a data retaining unit on standby, and may place the data retaining unit on standby in operation. Thus, even if one of the data retaining units stops because of failure or the like, the main operation will be continued properly.

The data retaining unit on standby may store the primary database in advance. If the data retaining unit on standby stores, in advance, the primary database, which is used in common and contains data for determining which share of the secondary database is to be used to process the data, the data retaining unit on standby can be placed in operation promptly even when one of the data retaining units in operation becomes inoperable.

A plurality of the data processing units may be provided so as to correspond to the plurality of data retaining units respectively. Also, the data processing system may further comprise a data supply unit, which provides acquired data to the plurality of data processing units in parallel. This enables appropriate data processing using the data processing units even in the case where a data retaining unit is further added, or the case where the content of a database retained by the respective data retaining units is changed because of database updating or the like.

The data supply unit may provide acquired data as it is to the plurality of data processing units in parallel without processing the data. Consequently, the data supply unit need not process data, thereby improving the data processing speed.

Upon acquisition of data from the data supply unit, each of the plurality of data processing units may refer to a database retained in the corresponding data retaining unit so as to determine whether or not to process the data. Accordingly, data can be appropriately processed by the proper data processing unit.

The data processing units may be communication control apparatuses which acquire packets to control communications, and, upon acquisition of a packet from the data supply unit, each of the data processing units may acquire the packet without determining whether the packet is directed to the data processing unit itself, and may refer to a database retained in the corresponding data retaining unit so as to determine whether or not to process the packet. Consequently, the data processing units need not check MAC addresses or IP addresses, thereby improving the packet processing speed.

Each of the communication control apparatuses may use information stored in the data portion of the packet instead of information stored in the header portion thereof to determine whether or not to process the packet. Also, each of the communication control apparatuses may refer to the primary database retained in the corresponding data retaining unit so as to determine whether or not to process the packet.

A data processing unit that has determined not to process the packet may discard the packet, while a data processing unit that has determined to process the packet may process the packet.
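As a rough illustration of this parallel supply scheme, the following Python sketch hands the same unmodified packet to every unit and lets each unit decide for itself. The packet layout (a dictionary with a `user` field standing in for information in the data portion), the mapping from user to share, and the names `handle` and `supply` are all hypothetical.

```python
from typing import Dict, List, Optional


def handle(unit_share: int, primary_db: Dict[str, int], packet: dict) -> Optional[str]:
    """One processing unit: consult the common primary database, then process or discard."""
    # Assumption: the primary database maps a user identifier to the share
    # of the secondary database that holds that user's settings.
    owner = primary_db.get(packet["user"])
    if owner != unit_share:
        return None  # not this unit's share: discard the packet
    return f"unit {unit_share} processed {packet['user']}"


def supply(packet: dict, unit_shares: List[int], primary_db: Dict[str, int]) -> List[str]:
    """Data supply unit: pass the packet as it is to every unit, without inspecting it."""
    results = [handle(share, primary_db, packet) for share in unit_shares]
    return [r for r in results if r is not None]
```

Because the supply side never looks inside the packet, adding a unit or moving a share only changes the databases, not the supply logic.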

The data supply unit may provide to the plurality of communication control apparatuses in parallel an acquired packet as a unicast packet without converting the packet to a broadcast packet. Consequently, the data supply unit need not, for example, process the header of the packet to convert the packet to a broadcast, thereby improving the packet processing speed.

Optional combinations of the aforementioned constituting elements, and implementations of the invention in the form of methods, apparatuses, systems, recording mediums and computer programs may also be practiced as additional modes of the present invention.

ADVANTAGEOUS EFFECTS

The present invention provides a technique for operating multiple databases appropriately.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a diagram that shows a configuration of a communication control system according to a base technology.

FIG. 2 is a diagram that shows a configuration of a conventional communication control apparatus.

FIG. 3 is a diagram that shows a configuration of a communication control apparatus according to the base technology.

FIG. 4 is a diagram that shows an internal configuration of a packet processing circuit.

FIG. 5 is a diagram that shows an internal configuration of a position detection circuit.

FIG. 6 is a diagram that shows an example of internal data of a first database.

FIG. 7 is a diagram that shows another example of internal data of the first database.

FIG. 8 is a diagram that shows yet another example of internal data of the first database.

FIG. 9 is a diagram that shows a configuration of comparison circuits included in a binary search circuit.

FIG. 10 is a diagram that shows an example of internal data of a second database.

FIG. 11 is a diagram that shows another example of internal data of the second database.

FIG. 12 is a diagram that shows an internal configuration of the packet processing circuit used for URL filtering.

FIG. 13A is a diagram that shows an example of internal data of a virus list; FIG. 13B is a diagram that shows an example of internal data of a whitelist; and FIG. 13C is a diagram that shows an example of internal data of a blacklist.

FIG. 14 is a diagram that shows an example of internal data of a common category list.

FIGS. 15A, 15B, 15C and 15D are diagrams that show examples of internal data of the second database.

FIG. 16 is a diagram that shows the priorities of the virus list, whitelist, blacklist and common category list.

FIG. 17 is a diagram that shows a configuration of a communication control system according to an embodiment.

FIG. 18 is a diagram that shows configurations of communication control apparatuses according to the embodiment.

FIG. 19 is a diagram that shows an example of internal data of a management table provided in an operation monitoring server.

FIG. 20 is a diagram for describing an operational procedure performed in the event that a communication control apparatus fails.

FIGS. 21A, 21B and 21C are diagrams for describing a procedure for updating databases in the communication control apparatuses.

FIG. 22 is a diagram that shows a configuration of a communication path control apparatus provided to process packets with multiple communication control apparatuses.

EXPLANATION OF REFERENCE NUMERALS

-   10 communication control apparatus
-   20 packet processing circuit
-   30 search circuit
-   32 position detection circuit
-   33 comparison circuit
-   34 index circuit
-   35 comparison circuit
-   36 binary search circuit
-   40 process execution circuit
-   50 first database
-   57 user database
-   60 second database
-   100 communication control system
-   110 operation monitoring server
-   111 management table
-   120 connection management server
-   130 message output server
-   140 log management server
-   150 database server
-   160 URL database
-   161 virus list
-   162 whitelist
-   163 blacklist
-   164 common category list
-   200 communication path control apparatus
-   210 switch
-   220 optical splitter
-   230 switch

BEST MODE FOR CARRYING OUT THE INVENTION

(Base Technology)

First, as a base technology, a communication control apparatus will be described as an illustrative data processing apparatus, and the configuration of its peripheral apparatuses and an outline of its operation will also be explained. Next, a URL filtering technique using the communication control apparatus will be described. Finally, a technique for operating multiple communication control apparatuses will be described as an embodiment.

FIG. 1 shows a configuration of a communication control system according to the base technology. A communication control system 100 comprises a communication control apparatus 10 and various peripheral apparatuses provided to support the operation of the communication control apparatus 10. The communication control apparatus 10 of the base technology performs a URL filtering function provided by an Internet service provider or the like. The communication control apparatus 10 provided on a network path acquires a request for access to a content, analyzes the content, and determines whether or not the access to the content should be permitted. If the access to the content is permitted, the communication control apparatus 10 will transmit the access request to a server that retains the content. If the access to the content is prohibited, the communication control apparatus 10 will discard the access request and return a warning message or the like to the source of the request. The communication control apparatus 10 of the base technology receives an access request, such as an HTTP (HyperText Transfer Protocol) “GET” request message. The apparatus then searches a list of reference data for determining access permission to check if the URL of the content to be accessed appears in the list, so as to determine whether or not the access to the content should be permitted.
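The permit-or-discard decision described above can be sketched in Python as follows. This is only a software illustration of the decision logic, not the hardware described later; the helper names, the use of the `Host` header to rebuild the URL, and the choice to treat the list as a list of prohibited URLs are assumptions.

```python
from typing import Set, Tuple


def requested_url(request: str) -> Tuple[str, str]:
    """Extract the method and a host+path URL from a well-formed HTTP/1.1 request."""
    lines = request.split("\r\n")
    method, path, _version = lines[0].split(" ", 2)
    host = next(
        (line.split(":", 1)[1].strip() for line in lines[1:]
         if line.lower().startswith("host:")),
        "",
    )
    return method, host + path


def filter_request(request: str, prohibited: Set[str]) -> str:
    """Return 'discard' if the requested URL is listed, otherwise 'forward'."""
    method, url = requested_url(request)
    if method != "GET":
        return "forward"  # assumption: only GET requests are checked in this sketch
    return "discard" if url in prohibited else "forward"
```

In the system described, a "discard" result would additionally cause a warning message to be returned to the source of the request.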

The peripheral apparatuses include an operation monitoring server 110, a connection management server 120, a message output server 130, a log management server 140 and a database server 150. The connection management server 120 manages connection to the communication control apparatus 10. When the communication control apparatus 10 processes a packet transmitted from a cellular phone terminal, for example, the connection management server 120 authenticates the user as a user of the communication control apparatus 10, based on information included in the packet, which uniquely identifies the cellular phone terminal. Once the user is authenticated, packets transmitted from the IP address, which is temporarily provided for the cellular phone terminal, will be transmitted to the communication control apparatus 10 and processed therein, without being authenticated by the connection management server 120 during a certain period. The message output server 130 outputs a message to the destination or the source of an access request, according to whether the communication control apparatus 10 has permitted the access. The log management server 140 manages the operating history of the communication control apparatus 10. The database server 150 acquires the latest database from a URL database 160 and provides the database to the communication control apparatus 10. To update the database without halting the operation of the communication control apparatus 10, the apparatus may possess a backup database. The operation monitoring server 110 monitors the operating state of the communication control apparatus 10 and its peripheral apparatuses including the connection management server 120, message output server 130, log management server 140 and database server 150. The operation monitoring server 110 has the highest priority in the communication control system 100 and performs supervisory control of the communication control apparatus 10 and all the peripheral apparatuses.

The communication control apparatus 10 is configured with a dedicated hardware circuit, as will be described later. By inputting to or outputting from the communication control apparatus 10 the data for monitoring by means of a boundary-scan circuit, based on the technique described in Japanese Patent No. 3041340 filed by the present applicant or other techniques, the operation monitoring server 110 can monitor the operating state even while the communication control apparatus 10 is in operation.

In the communication control system 100 of the base technology, as will be described below, the communication control apparatus 10, configured with a dedicated hardware circuit for faster operation, is controlled by using a group of peripheral servers connected thereto and having various functions. Accordingly, by suitably replacing the software of the group of servers, a wide variety of functions can be achieved with a similar configuration. Thus, the base technology provides a highly flexible communication control system.

FIG. 2 shows a configuration of a conventional communication control apparatus 1. The conventional communication control apparatus 1 comprises a communication control unit 2 on the receiving side, a packet processing unit 3, and a communication control unit 4 on the sending side. The communication control units 2 and 4 include PHY processing units 5 a and 5 b for performing physical layer processing of packets, and MAC processing units 6 a and 6 b for performing MAC layer processing of packets, respectively. The packet processing unit 3 includes protocol processing units for performing protocol-specific processing, such as an IP processing unit 7 for performing IP (Internet Protocol) processing and a TCP processing unit 8 for performing TCP (Transmission Control Protocol) processing. The packet processing unit 3 also includes an AP processing unit 9 for performing application layer processing. The AP processing unit 9 performs filtering or other processing according to data included in a packet.

The packet processing unit 3 of the conventional communication control apparatus 1 is implemented by software, using a general-purpose processor, or CPU, and an OS running on the CPU. With such configuration, however, the performance of the communication control apparatus 1 depends on the performance of the CPU, hampering the creation of a communication control apparatus capable of high-speed processing of a large volume of packets. For example, a 64-bit CPU can process only up to 64 bits at a time, and hence, there has existed no communication control apparatus having a higher performance than this. In addition, since the conventional communication control apparatus is predicated on the presence of an OS with versatile functionality, the possibility of security holes cannot be eliminated completely, requiring maintenance work including OS upgrades.

FIG. 3 shows a configuration of a communication control apparatus in the base technology. The communication control apparatus 10 comprises a packet processing circuit 20 configured with dedicated hardware employing a wired logic circuit, instead of the packet processing unit 3 implemented by software including a CPU and an OS in the conventional communication control apparatus 1 shown in FIG. 2. By providing a dedicated hardware circuit to process communication data, rather than processing it with an OS and software running on a general-purpose processing circuit such as a CPU, the performance limitations posed by the CPU or OS can be overcome, enabling a communication control apparatus having high throughput.

For example, a case will be considered here in which, in packet filtering or the like, a search is conducted to check if the data in a packet includes reference data, which serves as criteria for filtering. When a CPU is used to compare the communication data with the reference data, there occurs a problem in that, since only 64-bit data can be compared at a time, the processing speed cannot be improved beyond such CPU performance. Since the CPU needs to repeat the process of loading 64 bits of communication data into a memory and comparing it with the reference data, the memory load time becomes a bottleneck which limits the processing speed.

In the base technology, by contrast, a dedicated hardware circuit configured with a wired logic circuit is provided to compare communication data with reference data. This circuit includes multiple comparators arranged in parallel, so as to enable the comparison of data having a length greater than 64 bits, such as 1024 bits. By providing dedicated hardware in such manner, bit matching can be simultaneously performed on a large number of bits in parallel. Since 1024-bit data can be processed at a time, while the conventional communication control apparatus 1 using a CPU processes only 64 bits, the processing speed can be improved remarkably. Increasing the number of comparators will improve the throughput, but also increase the cost and size of the apparatus. Accordingly, an optimal hardware circuit may be designed in accordance with the desired performance, cost or size. The dedicated hardware circuit may be configured using FPGA (Field Programmable Gate Array), etc.

Since the communication control apparatus 10 of the base technology is configured with dedicated hardware employing a wired logic circuit, it does not require any OS (Operating System). This can eliminate the need for the installation, bug fixes, or version upgrades of an OS, thereby reducing the cost and man-hours required for administration and maintenance. Also, unlike CPUs requiring versatile functionality, the communication control apparatus 10 does not include any unnecessary functions or use needless resources, and hence, reduced cost, a smaller circuit area or improved processing speed can be expected. Furthermore, again unlike conventional OS-based communication control apparatuses, the absence of unnecessary functions decreases the possibility of security holes and thus enhances the tolerance against attacks from malicious third parties over a network.

FIG. 4 shows an internal configuration of the packet processing circuit. The packet processing circuit 20 comprises: a first database 50 for storing reference data to be referred to when determining processing to be performed on communication data; a search circuit 30 for searching received communication data for the reference data by comparing the two; a second database 60 for storing a search result of the search circuit 30 and a content of processing to be performed on the communication data, which are related to each other; and a process execution circuit 40 for processing the communication data based on the search result of the search circuit 30 and the conditions stored in the second database 60.

The search circuit 30 includes: a position detection circuit 32 for detecting the position of comparison target data, which is to be compared with reference data, in communication data; an index circuit 34 which serves as an example of a determination circuit for determining which range the comparison target data belongs to, among three or more ranges into which the reference data stored in the first database 50 is divided; and a binary search circuit 36 for searching the determined range for the reference data that matches the comparison target data. The reference data may be searched for the comparison target data using any search technique; a binary search method is used in the base technology.

FIG. 5 shows an internal configuration of the position detection circuit. The position detection circuit 32 includes multiple comparison circuits 33 a-33 f which compare communication data with position identification data for identifying the position of comparison target data. While six comparison circuits 33 a-33 f are provided here, the number of comparison circuits may be arbitrary, as will be described later. Pieces of communication data are input to the comparison circuits 33 a-33 f, with each piece shifted from the preceding one by a predetermined data length, such as 1 byte. These multiple comparison circuits 33 a-33 f then compare, simultaneously and in parallel, the communication data with the position identification data to be detected.

The operation of the communication control apparatus 10 of the base technology will be explained by way of the following example: a character string “No. ###” in communication data is detected, the number “###” included in the character string is then compared with reference data, and if the number matches the reference data, the packet will be allowed to pass, while, if they do not match, the packet will be discarded.

In the example of FIG. 5, communication data “01No. 361 . . . ” is input to the comparison circuits 33 a-33 f with a shift of one character each, and position identification data “No.” for identifying the position of the number “###” is sought to be detected in the communication data. More specifically, “01N” is input to the comparison circuit 33 a, “1No” to the comparison circuit 33 b, “No.” to the comparison circuit 33 c, “o. ” to the comparison circuit 33 d, “. 3” to the comparison circuit 33 e, and “ 36” to the comparison circuit 33 f. Then, the comparison circuits 33 a-33 f simultaneously perform comparisons with the position identification data “No.”. Consequently, a match is found at the comparison circuit 33 c, indicating that the character string “No.” exists at the third character from the top of the communication data. Thus, it is determined that the numeral data as comparison target data exists subsequent to the position identification data “No.” detected by the position detection circuit 32.
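The shifted-window comparison in this example can be mimicked in software. The Python sketch below builds one window per comparison circuit and evaluates all of them in a single pass; in hardware the comparisons would be truly simultaneous, whereas the list comprehension here only models the data flow. The function names and the default of six comparators are illustrative assumptions.

```python
from typing import List, Optional


def comparator_inputs(data: str, width: int, n_comparators: int = 6) -> List[str]:
    """Windows fed to the comparators, each shifted by one character."""
    return [data[i:i + width] for i in range(n_comparators)]


def detect_position(data: str, marker: str, n_comparators: int = 6) -> Optional[int]:
    """Return the zero-based index of the comparator whose window equals the marker."""
    windows = comparator_inputs(data, len(marker), n_comparators)
    matches = [w == marker for w in windows]  # one result per comparator
    return matches.index(True) if True in matches else None
```

For the data “01No. 361” and the marker “No.”, the match is reported at index 2, that is, the third comparator, corresponding to comparison circuit 33 c.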

When the same processing is performed by a CPU, since the comparison process needs to be serially performed one by one from the top, such as comparing character strings “01N” and “No.” before comparing “1No” and “No.”, no improvement of detection speed can be expected. In the communication control apparatus 10 of the base technology, in contrast, providing the multiple comparison circuits 33 a-33 f in parallel enables simultaneous parallel comparison processing, which could not have been performed by a CPU, improving the processing speed significantly. Providing more comparison circuits will improve the detection speed, as more characters can be compared simultaneously. In consideration of cost or size, a sufficient number of comparison circuits may be provided to achieve a desired detection speed.

Aside from detecting position identification data, the position detection circuit 32 may also be used as a circuit for detecting character strings for various purposes. Moreover, the position detection circuit 32 may be configured to detect position identification data in units of bits, not just as a character string.

FIG. 6 shows an example of internal data of the first database. The first database 50 stores reference data to be referred to when determining the processing on packets, such as filtering, routing, switching, and replacement. The pieces of reference data are sorted according to some sort conditions. In the example of FIG. 6, 1000 pieces of reference data are stored.

The top record of the first database 50 contains an offset 51 which indicates the position of comparison target data in communication data. For example, in a TCP packet, the data configuration within the packet is determined in units of bits. Therefore, if the position of flag information or the like for determining the processing on the packet is given in the form of the offset 51, the processing can be determined by comparing only necessary bits, thus improving the processing efficiency. Also, even when the configuration of packet data is changed, the change can be accommodated by modifying the offset 51 accordingly. The first database 50 may store the data length of comparison target data. In this case, since the comparison can be performed by operating only a required number of comparators, the search efficiency can be improved.

The index circuit 34 determines which range the comparison target data belongs to, among three or more ranges, such as 52 a-52 d, into which reference data stored in the first database 50 is divided. In the example of FIG. 6, the 1000 pieces of reference data are divided into four ranges 52 a-52 d, i.e., 250 pieces each. The index circuit 34 includes multiple comparison circuits 35 a-35 c, each of which compares a piece of reference data at the border of the range with the comparison target data. Since the comparison circuits 35 a-35 c simultaneously compare the pieces of reference data at the borders with the comparison target data in parallel, the range to which the comparison target data belongs can be determined by a single operation of comparison processing.

As mentioned previously, CPU-based binary search cannot make multiple comparisons at the same time. In the communication control apparatus 10 of the base technology, in contrast, providing the multiple comparison circuits 35 a-35 c in parallel enables simultaneous parallel comparison processing, with a significant improvement in the search speed.

After the index circuit 34 determines the relevant range, the binary search circuit 36 performs a search using a binary search method. The binary search circuit 36 divides the range determined by the index circuit 34 further into two and subsequently compares the piece of reference data lying at the border with the comparison target data, thereby determining which range the comparison target data belongs to. The binary search circuit 36 includes multiple comparison circuits for comparing, bit by bit, reference data with comparison target data. For example, 1024 comparison circuits are provided in the base technology to perform bit matching on 1024 bits simultaneously. When the range to which the comparison target data belongs is determined between the two split ranges, the determined range is further divided into two. Then, the reference data lying at the border is read out to be compared with the comparison target data. Thereafter, this processing is repeated to narrow the range further until reference data that matches the comparison target data is eventually found.
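The repeated halving described here corresponds to an ordinary binary search restricted to the range selected beforehand. A minimal Python sketch follows; the function name and the use of integer reference values are assumptions, and each `==` or `<` stands in for one wide, parallel bit comparison in the hardware.

```python
from typing import List, Optional


def binary_search(reference: List[int], target: int, lo: int, hi: int) -> Optional[int]:
    """Search the sorted slice reference[lo..hi] (inclusive) for target."""
    while lo <= hi:
        mid = (lo + hi) // 2          # border between the two halves
        if reference[mid] == target:
            return mid                # matching reference data found
        if reference[mid] < target:
            lo = mid + 1              # continue in the upper half
        else:
            hi = mid - 1              # continue in the lower half
    return None                       # no reference data matches
```

With 1000 sorted entries split into four ranges of 250, the search inside one range needs at most eight halving steps, since 2^8 = 256 ≥ 250.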

The operation will now be described in more detail in conjunction with the foregoing example. In the communication data shown in FIG. 5, the number “361” is the comparison target data that follows the position identification data “No.”. Since a single space character intervenes between the position identification data “No.” and the comparison target data “361”, the offset 51 is set to “8” bits in order to exclude the space from the comparison target data. Accordingly, the binary search circuit 36 skips the first “8” bits, or 1 byte, of the communication data subsequent to the position identification data “No.” and reads the following “361” as the comparison target data.

Each of the comparison circuits 35 a-35 c of the index circuit 34 receives “361” as comparison target data. As for reference data, the comparison circuit 35 a receives “378”, which lies at the border of the ranges 52 a and 52 b. Similarly, the comparison circuit 35 b receives reference data “704” lying at the border of the ranges 52 b and 52 c, and the comparison circuit 35 c receives reference data “937” lying at the border of the ranges 52 c and 52 d. The comparison circuits 35 a-35 c then perform comparisons simultaneously, determining that the comparison target data “361” belongs to the range 52 a. Subsequently, the binary search circuit 36 searches the reference data for the comparison target data “361”.
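The worked example (an offset of 8 bits, then border comparisons against 378, 704 and 937) can be traced with a short Python sketch. The function names, the fixed target length of three characters, and the convention that a border value belongs to the lower of its two ranges are assumptions; the text does not state which side a border value falls on.

```python
from typing import List


def extract_target(data: str, marker_pos: int, marker: str,
                   offset_bits: int, length: int) -> str:
    """Skip the marker and the offset, then read the comparison target data."""
    start = marker_pos + len(marker) + offset_bits // 8
    return data[start:start + length]


def index_range(target: int, borders: List[int]) -> int:
    """Return the zero-based range number from one comparison per border."""
    results = [target > b for b in borders]  # all borders compared in one step
    return sum(results)                      # number of borders below the target
```

Range number 0 corresponds to range 52 a, 1 to 52 b, and so on; the target 361 lies below all three borders and therefore falls in range 0.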

FIG. 7 shows another example of internal data of the first database. In the example shown in FIG. 7, the number of pieces of reference data is smaller than the number of pieces of data storable in the first database 50, i.e., 1000 in this case. In such an instance, the first database 50 stores the pieces of reference data in descending order, starting with the last data position therein. Then, 0 is stored in the rest of the data positions. That is, the database is loaded with data not from the top but from the bottom of the loading area, and all the vacancies occurring at the front of the loading area, if any, are filled with zero. Consequently, the database is fully loaded at any time, so that the maximum time necessary for a binary search will be constant. Moreover, if the binary search circuit 36 reads reference data “0” during a search, the circuit can identify the range without making a comparison, as the comparison result is obvious, and can proceed to the next comparison. Consequently, the search speed can be improved.
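A minimal Python sketch of this bottom-loaded, zero-padded layout follows. It assumes that 0 never occurs as a genuine reference value, and the names `load` and `search` are hypothetical. The counter only makes visible how many real comparisons were needed once padding entries are skipped.

```python
from typing import List, Optional, Tuple


def load(values: List[int], capacity: int) -> List[int]:
    """Fill the database from the bottom; unused leading positions hold 0."""
    ordered = sorted(values)
    return [0] * (capacity - len(ordered)) + ordered


def search(db: List[int], target: int) -> Tuple[Optional[int], int]:
    """Binary search that skips padding; returns (position, comparisons made)."""
    lo, hi, compared = 0, len(db) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        entry = db[mid]
        if entry == 0:
            lo = mid + 1      # padding: the target can only lie above, no comparison
            continue
        compared += 1
        if entry == target:
            return mid, compared
        if entry < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, compared
```

Because the table always has the same length, the number of halving steps has a fixed upper bound regardless of how many entries are actually loaded.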

In CPU-based software processing, the first database 50 stores pieces of reference data in ascending order, from the first data position therein. A maximum value or the like is stored in the rest of the data positions, and in such a case the skip of comparison processing described above cannot be made during a binary search. The comparison technique described above can be implemented by configuring the search circuit 30 with a dedicated hardware circuit.

FIG. 8 shows yet another example of internal data of the first database. In the example shown in FIG. 8, the reference data is not evenly divided into three or more ranges, but unevenly divided into ranges that accommodate different numbers of pieces of data, such as 500 pieces in the range 52 a and 100 pieces in the range 52 b. These ranges may be determined depending on the distribution of frequencies with which reference data occurs in communication data. Specifically, the ranges may be determined so that the sums of the frequencies of occurrence of reference data belonging to the respective ranges are almost the same. Accordingly, the search efficiency can be improved. The reference data to be input to the comparison circuits 35 a-35 c of the index circuit 34 may be modifiable from the outside. In such case, the ranges can be dynamically set, so that the search efficiency will be optimized.
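One conceivable way of choosing the borders so that each range carries roughly the same total frequency of occurrence is sketched below. The greedy procedure and the name `balanced_borders` are assumptions introduced for illustration; the description does not prescribe a particular method.

```python
def balanced_borders(freq: dict[int, int], n_ranges: int) -> list[int]:
    """Return border values splitting the sorted reference data into
    `n_ranges` ranges with approximately equal summed frequencies."""
    total = sum(freq.values())
    borders: list[int] = []
    running = 0      # cumulative frequency of the data placed so far
    next_cut = 1     # index of the next equal share to be closed
    for key in sorted(freq):
        # Close the current range once the cumulative count reaches the next share.
        if next_cut < n_ranges and running >= total * next_cut / n_ranges:
            borders.append(key)   # `key` becomes the first value of the next range
            next_cut += 1
        running += freq[key]
    return borders
```

For example, if one piece of reference data accounts for half of all occurrences, it is given a range of its own and the remaining pieces share the other range, so that the ranges differ in size but not in how often they are hit.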

FIG. 9 shows a configuration of comparison circuits included in the binary search circuit. As mentioned previously, the binary search circuit 36 includes 1024 comparison circuits, such as 36 a, 36 b, and so on. Each of the comparison circuits 36 a, 36 b, etc. receives 1 bit of reference data 54 and 1 bit of comparison target data 56 to compare the bits in value. The comparison circuits 35 a-35 c of the index circuit 34 have similar internal configurations. Since the comparison processing is thus performed by a dedicated hardware circuit, a large number of comparison circuits can be operated in parallel to compare a large number of bits at a time, thereby speeding up the comparison processing.

FIG. 10 shows an example of internal data of the second database. The second database 60 includes a search result field 62, which contains a search result of the search circuit 30, and a processing content field 64, which contains a processing content to be performed on communication data. The database stores the search results and the processing contents in association with each other. In the example of FIG. 10, conditions are established such that a packet will be allowed to pass if its communication data contains reference data; if not, the packet will be discarded. The process execution circuit 40 searches the second database 60 for a processing content based on the search result and performs the processing on the communication data. The process execution circuit 40 may also be configured with a wired logic circuit.

FIG. 11 shows another example of internal data of the second database. In the example of FIG. 11, the processing content is set for each piece of reference data. With regard to packet replacement, replacement data may be stored in the second database 60. As for packet routing or switching, information on the route may be stored in the second database 60. The process execution circuit 40 performs processing, such as filtering, routing, switching, or replacement, which is specified in the second database 60, in accordance with the search result of the search circuit 30. When the processing content is set for each piece of reference data, as shown in FIG. 11, the first database 50 and the second database 60 may be merged with each other.
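The relationship between a search result and the processing content may be sketched as a simple lookup. All table entries below, the default of discarding an unmatched packet, and the name `execute` are hypothetical and serve only to illustrate the idea of a processing content set for each piece of reference data.

```python
from typing import Optional

# Hypothetical second database: matched reference data -> (processing, argument)
SECOND_DB = {
    "361": ("pass", None),
    "378": ("discard", None),
    "704": ("replace", b"***"),     # replacement data stored with the entry
    "937": ("route", "port-2"),     # route information stored with the entry
}
DEFAULT = ("discard", None)         # assumed processing when nothing matches


def execute(packet: bytes, match: Optional[str]):
    """Apply the processing content associated with the search result."""
    action, arg = SECOND_DB.get(match, DEFAULT)
    if action == "pass":
        return packet
    if action == "discard":
        return None
    if action == "replace":
        return packet.replace(match.encode(), arg)
    if action == "route":
        return (arg, packet)
    raise ValueError(f"unknown processing content: {action}")
```

Since both tables are keyed by the same reference data in this arrangement, they could equally be held as a single merged table.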

The first database and the second database are configured to be rewritable from the outside. By replacing these databases, various types of data processing and communication control can be achieved using the same communication control apparatus 10. Also, multistage search processing may be performed by providing two or more databases that store reference data to be searched. In such an instance, more complicated conditional branching may be performed by providing two or more databases that store search results and processing contents in association with each other. When multiple databases are thus provided to conduct a multistage search, a plurality of the position detection circuits 32, the index circuits 34, the binary search circuits 36, etc. may also be provided.

The data intended for the foregoing comparison may be compressed by the same compression logic. If both the source data and the target data to be compared are compressed by the same method, the comparison can be performed in the same manner as usual, thus reducing the amount of data to be loaded for comparison. The smaller amount of data to be loaded can reduce the time required to read out the data from the memory, thereby reducing the overall processing time. Moreover, the number of comparators can also be reduced, which contributes to the miniaturization, weight saving, and cost reduction of the apparatus. The data intended for comparison may be stored in a compressed form, or may be read out from the memory and compressed before comparison.

For the data processing apparatus stated above, the following aspects may be provided.

[Aspect 1]

A data processing apparatus comprising:

a first memory unit which contains reference data to be referred to when determining contents of processing to be performed on acquired data;

a search section which searches the data for the reference data by comparing the data and the reference data;

a second memory unit which stores a result of search obtained by the search section and the contents of processing in association with each other; and

a processing section which performs the processing associated with the result of search on the data, based on the result of search, wherein

the search section is composed of a wired logic circuit.

[Aspect 2]

The data processing apparatus of Aspect 1, wherein the wired logic circuit includes a plurality of first comparison circuits which compare the data with the reference data bit by bit.

[Aspect 3]

The data processing apparatus of Aspect 1, wherein the search section includes a position detection circuit which detects in the data a position of comparison target data to be compared with the reference data.

[Aspect 4]

The data processing apparatus of Aspect 3, wherein the position detection circuit includes a plurality of second comparison circuits which compare the data with position identification data for identifying the position of the comparison target data, and wherein the plurality of second comparison circuits receive the data, each having a shift of a predetermined data length, and compare the data with the position identification data simultaneously in parallel.

[Aspect 5]

The data processing apparatus of Aspect 1 or 2, wherein the search section includes a binary search circuit which searches the data for the reference data by binary search.

[Aspect 6]

The data processing apparatus of Aspect 5, wherein, when the number of pieces of the reference data is smaller than the number of pieces of data storable in the first memory unit, the reference data is stored in the first memory unit in descending order from the last data position, while 0 is stored in the rest of the data positions.

[Aspect 7]

The data processing apparatus of any one of Aspects 1 to 6, wherein the search section includes a determination circuit which determines which range the comparison target data to be compared with the reference data pertains to, out of three or more ranges into which the plurality of pieces of reference data stored in the first memory unit are divided.

[Aspect 8]

The data processing apparatus of Aspect 7, wherein the determination circuit includes a plurality of third comparison circuits which compare reference data at borders of the ranges with the comparison target data so that the plurality of third comparison circuits determine which of the three or more ranges the comparison target data pertains to simultaneously in parallel.

[Aspect 9]

The data processing apparatus of Aspect 7 or 8, wherein the ranges are determined depending on a distribution of frequencies of occurrence of the reference data in the data.

[Aspect 10]

The data processing apparatus of any one of Aspects 1 to 9, wherein the first memory unit further contains information that indicates the position of the comparison target data in the data, and wherein the search section extracts the comparison target data based on the position-indicating information.

[Aspect 11]

The data processing apparatus of any one of Aspects 1 to 10, wherein the first memory unit or the second memory unit is configured to be rewritable from the outside.

Next, a URL filtering technique using the communication control apparatus 10 discussed above will be described.

FIG. 12 shows an internal configuration of the packet processing circuit 20 used for URL filtering. The packet processing circuit 20 comprises, as the first database 50, a user database 57, a virus list 161, a whitelist 162, a blacklist 163 and a common category list 164. The user database 57 stores information on users who use the communication control apparatus 10. The communication control apparatus 10 receives, from a user, information for identifying the user, and performs matching between the information received by the search circuit 30 therein and the user database 57 to authenticate the user. For the user-identifying information, a source address stored in the IP header of a TCP/IP packet, or a user ID and a password provided by a user may be used. In the former case, the storage location of the source address in a packet is already known. Accordingly, when the search circuit 30 performs matching with the user database 57, the position detection circuit 32 need not detect the position, and it suffices to specify the storage location of the source address as the offset 51. After the user is authenticated as a user registered in the user database 57, the URL of a content is checked against the virus list 161, whitelist 162, blacklist 163 and common category list 164, in order to determine whether or not the access to the content should be permitted. The whitelist 162 and blacklist 163 are provided for each user, and when a user ID is uniquely specified after the user authentication, the whitelist 162 and blacklist 163 for the user are provided to the search circuit 30.

The virus list 161 contains a list of URLs of contents containing computer viruses. If a URL is contained in the virus list 161, a request for access to the content having such URL will be denied. The whitelist 162 is provided for each user and contains a list of URLs of contents to which access is permitted. The blacklist 163 is also provided for each user but contains a list of URLs of contents to which access is prohibited. FIG. 13A shows an example of internal data of the virus list 161. Similarly, FIG. 13B shows an example of internal data of the whitelist 162, and FIG. 13C shows that of the blacklist 163. Each of the virus list 161, whitelist 162 and blacklist 163 contains a category number field 165, a URL field 166 and a title field 167. The URL field 166 contains a URL of a content to which access is permitted or prohibited. The category number field 165 contains a category number of a content. The title field 167 contains a title of a content.

The common category list 164 contains a list for classifying contents represented by URLs into multiple categories. FIG. 14 shows an example of internal data of the common category list 164. The common category list 164 also contains the category number field 165, URL field 166 and title field 167.

The communication control apparatus 10 extracts a URL included in a “GET” request message and searches the virus list 161, whitelist 162, blacklist 163 and common category list 164 for the URL using the search circuit 30. At this time, a character string “http://”, for example, may be detected by the position detection circuit 32 so as to extract the subsequent data string as target data. Then, the index circuit 34 and binary search circuit 36 perform matching between the extracted URL and the reference data in the virus list 161, whitelist 162, blacklist 163 and common category list 164.
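The extraction of the URL following the character string “http://” may be sketched as follows. The function name `extract_url`, the sample request and the choice of whitespace as the end of the target data are assumptions of this sketch.

```python
from typing import Optional


def extract_url(message: bytes, marker: bytes = b"http://") -> Optional[str]:
    """Return the data string that follows `marker`, up to the next whitespace."""
    pos = message.find(marker)
    if pos < 0:
        return None                      # position identification data not present
    start = pos + len(marker)
    end = start
    # Assumption: the target data ends at the first space, CR or LF.
    while end < len(message) and message[end:end + 1] not in (b" ", b"\r", b"\n"):
        end += 1
    return message[start:end].decode("ascii", errors="replace")


# Hypothetical request message
request = b"GET http://www.example.com/index.html HTTP/1.1\r\n"
url = extract_url(request)
```

The string obtained in this way is the comparison target data that is then matched against the reference data in each of the lists.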

FIGS. 15A, 15B, 15C and 15D show examples of internal data of the second database 60 used for URL filtering. FIG. 15A shows the search result and processing content with respect to the virus list 161. If a URL included in a GET request matches a URL included in the virus list 161, the access to the URL will be prohibited. FIG. 15B shows the search result and processing content with respect to the whitelist 162. If a URL included in a GET request matches a URL included in the whitelist 162, the access to the URL will be permitted. FIG. 15C shows the search result and processing content with respect to the blacklist 163. If a URL included in a GET request matches a URL included in the blacklist 163, the access to the URL will be prohibited.

FIG. 15D shows the search result and processing content with respect to the common category list 164. As shown in FIG. 15D, a user can determine, with respect to each of the categories, the permission or prohibition of the access to contents belonging to the category, in relation to the results of search through the common category list 164. The second database 60 for the common category list 164 contains a user ID field 168 and a category field 169. The user ID field 168 contains an ID for identifying a user. The category field 169 contains information that indicates the permission or prohibition of the access to contents belonging to the respective categories, which is determined by a user for each of the 57 classified categories. If a URL included in a GET request matches a URL included in the common category list 164, the permission for the access to the URL will be determined according to the category that the URL belongs to and the user ID. Although the number of common categories is 57 in FIG. 15D, it is not limited thereto.
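The per-user, per-category determination may be illustrated by the following sketch. The URLs, category numbers, user ID and the name `category_decision` are all hypothetical; only the two-step lookup (URL to category, then user and category to permission) reflects the description above.

```python
from typing import Optional

# Hypothetical common category list: URL -> category number
COMMON_CATEGORY_LIST = {
    "www.example.com/news": 3,
    "www.example.com/games": 12,
}

# Hypothetical category field per user ID: category number -> access permitted?
CATEGORY_FIELD = {
    "000001": {3: True, 12: False},
}


def category_decision(user_id: str, url: str) -> Optional[bool]:
    """Return True/False for permit/prohibit, or None if the URL is not listed."""
    category = COMMON_CATEGORY_LIST.get(url)
    if category is None:
        return None
    return CATEGORY_FIELD[user_id].get(category)
```

The common category list itself is identical for all users; only the category field, which records each user's choices, differs from user to user.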

FIG. 16 shows the priorities of the virus list 161, whitelist 162, blacklist 163 and common category list 164. In the base technology, the virus list 161, whitelist 162, blacklist 163 and common category list 164 are given priority in this order, the virus list 161 having the highest priority. For example, even though a URL of a content appears in the whitelist 162 and the access thereto is permitted therein, the access will be prohibited if the URL also appears in the virus list 161, as it is determined that the content contains a computer virus.

When conventional software-based matching is performed in consideration of such priorities, the matching is performed on the lists, for example, in descending order of priority and the first match is employed. Alternatively, the matching is performed on the lists in ascending order of priority, and the latest match is employed to replace the preceding match. In the base technology using the communication control apparatus 10 configured with a dedicated hardware circuit, in contrast, there are provided a search circuit 30 a for performing matching with respect to the virus list 161, a search circuit 30 b for performing matching with respect to the whitelist 162, a search circuit 30 c for performing matching with respect to the blacklist 163, and a search circuit 30 d for performing matching with respect to the common category list 164; these search circuits 30 perform matching simultaneously in parallel. When matches are found in multiple lists, the one with the highest priority is employed. Thus, even when multiple databases are provided and the priorities thereof are defined, the search time can be reduced remarkably.
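The selection of the highest-priority match may be sketched as follows. In this illustration the four membership tests stand in for the search circuits 30 a-30 d, which operate in parallel in hardware; the list contents, the name `decide` and the default of permitting access when no list matches are assumptions.

```python
PRIORITY = ["virus", "whitelist", "blacklist", "category"]   # highest priority first
ACTION = {"virus": "prohibit", "whitelist": "permit", "blacklist": "prohibit"}


def decide(url: str, lists: dict, category_permitted: bool) -> str:
    """Match `url` against every list, then employ the highest-priority match."""
    # All lists are searched regardless of the other results.
    hits = {name: url in lists[name] for name in PRIORITY}
    for name in PRIORITY:
        if hits[name]:
            if name == "category":
                return "permit" if category_permitted else "prohibit"
            return ACTION[name]
    return "permit"   # assumed default when no list contains the URL


# Hypothetical list contents
lists = {
    "virus": {"bad.example/x"},
    "whitelist": {"bad.example/x", "ok.example"},
    "blacklist": {"blk.example"},
    "category": {"cat.example"},
}
```

A URL that appears in both the virus list and the whitelist is prohibited, since the match in the virus list takes precedence.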

The priorities of the virus list 161, whitelist 162, blacklist 163 and common category list 164, with which the permission of access is determined, may be provided in the second database 60, for example. The conditions in the second database 60 may be modified depending on the priorities of the lists.

When access to a content is permitted, the process execution circuit 40 outputs a signal to the message output server 130 to convey the permission. The message output server 130 then transmits a “GET” request message to the server retaining the content. When access to a content is prohibited, the process execution circuit 40 outputs a signal to the message output server 130 to convey the prohibition, and the message output server 130 then discards the “GET” request message addressed to the server of the access destination without transmitting it. At this time, a response message conveying the prohibition of the access may be transmitted to the request source. Alternatively, transfer to another web page may be forced. In this case, the process execution circuit 40 changes the destination address and URL to those of the transfer destination and transmits the “GET” request message. Information including such response message or URL of the transfer destination may be stored in the second database 60 or the like.

With the configuration and operation as described above, access to an inappropriate content can be prohibited.

Also, since the search circuit 30 is a dedicated hardware circuit configured with an FPGA or the like, high-speed search processing can be achieved, as discussed previously, and the filtering process can be performed with minimal effect on the traffic. By providing such a filtering service, an Internet service provider can provide added value, thus gaining more users.

The whitelist 162 or blacklist 163 may be provided in common for all users.

EMBODIMENT

There will now be described an operating technique for the case where multiple communication control apparatuses 10 are provided in the communication control system 100. For example, it is assumed that, in the aforementioned communication control apparatus 10 for performing filtering control using URLs, the storage apparatus for storing databases, such as a RAM (Random Access Memory), has a capacity for storing data of 100,000 users. In such case, if the number of users of the communication control system 100 exceeds 100,000, the communication control apparatus 10 will need to be replaced with another communication control apparatus 10 that has a storage apparatus capable of storing data of over 100,000 users. In the technique of the present embodiment, on the other hand, multiple communication control apparatuses are provided, and each of them has a storage apparatus for storing a share of databases including the whitelist 162 and blacklist 163. By operating such apparatuses cooperatively, they can function as one large communication control apparatus 10. Therefore, even if the number of users exceeds the capacity of the communication control apparatus 10, such increase of users can be handled by newly adding a communication control apparatus. In this way, the present embodiment proposes a versatile and flexible technique for operating the communication control apparatus 10. Such technique can reduce man-hours and costs required for system modification due to increase of users. In addition, initial investment can also be reduced because a large scale system does not have to be constructed at the beginning in expectation of increase of users; instead, only an appropriate number of communication control apparatuses need to be provided based on the number of users.

FIG. 17 shows a configuration of the communication control system 100 according to the embodiment. The communication control system 100 of the present embodiment is provided with multiple communication control apparatuses 10 a, 10 b, 10 c, etc., which are cooperatively operated to function as the communication control apparatus 10 described in the base technology. Other configurations and operations are the same as those of the communication control system 100 according to the base technology shown in FIG. 1.

In the communication control system 100 of the present embodiment, as many communication control apparatuses are provided as are required to share and store at least part of the databases necessary for packet processing, and at least one extra apparatus is additionally provided. In the aforementioned example, when the number of users is 300,000 or above but less than 400,000, the number of communication control apparatuses required for operation is four. However, one or more communication control apparatuses should be further provided as standby units in case any of the communication control apparatuses in operation fails or in case a database in any of the communication control apparatuses is updated. Accordingly, at least five communication control apparatuses are provided in total. Conventionally, the entire system has needed to be duplexed in consideration of fault tolerance. According to the technique of the present embodiment, in contrast, it suffices to provide an extra divided unit of the communication control apparatus 10, thereby enabling cost reduction. The operating state of the multiple communication control apparatuses 10 a, 10 b, 10 c, etc. is managed by the operation monitoring server 110. The operation monitoring server 110 of the present embodiment has a management table for managing the operating state of the communication control apparatuses.
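The number of apparatuses to be provided follows from a simple calculation, sketched below. The capacity of 100,000 users per apparatus is the figure assumed in the present embodiment; the function name and the default of a single standby unit are assumptions of this sketch.

```python
def apparatuses_to_provide(users: int, capacity: int = 100_000, standby: int = 1) -> int:
    """Apparatuses needed to hold all user data, plus the standby units."""
    if users < 0 or capacity <= 0 or standby < 0:
        raise ValueError("invalid arguments")
    operating = -(-users // capacity)   # ceiling division without floating point
    return operating + standby
```

For 350,000 users this yields four operating apparatuses and one standby unit, five in total, whereas duplexing the entire system would require eight.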

FIG. 18 shows configurations of the communication control apparatuses 10 according to the present embodiment. In the configuration shown in FIG. 18, the search circuit 30 and process execution circuit 40 correspond to a data processing unit, and the configuration retaining the first database 50 and second database 60 corresponds to a data retaining unit in the present invention. In the example shown in FIG. 18, a data retaining unit and the corresponding data processing unit are provided in each of the communication control apparatuses 10 a, 10 b and 10 c. These communication control apparatuses may be collectively provided in a single apparatus. Also, a single data processing unit may refer to databases stored in multiple data retaining units to process data. The data retaining unit may be a storage apparatus, such as a RAM, or may be a part of the area in the storage apparatus. Also, multiple storage apparatuses may be considered as one data retaining unit.

Among the databases used for packet processing in the communication control apparatuses 10, the whitelist 162 (FIG. 13B) and blacklist 163 (FIG. 13C) in the first database 50, and a portion of the common category list in which access permission is determined for each user (FIG. 15D) in the second database 60 need larger capacity in proportion to the increasing number of users. Accordingly, these databases are divided into portions to be stored by the data retaining units of the communication control apparatuses 10 a, 10 b, 10 c, etc. Such database, i.e. the whitelist 162 or blacklist 163, corresponds to the secondary database in the present invention. Since the virus list 161 (FIG. 13A) and common category list 164 (FIG. 14) are used in common by all users and do not need very large capacity, these lists are stored in common in the data retaining units of all the communication control apparatuses 10 a, 10 b, 10 c, etc.

As will be discussed later, in the communication control system 100 of the present embodiment, a communication packet to be processed is sent to all the communication control apparatuses 10 a, 10 b, 10 c, etc. in operation, and the respective communication control apparatuses then determine whether or not to process the packet. Thereafter, only the communication control apparatus that handles processing of the packet, i.e. the communication control apparatus that retains data of the user who has sent the packet, processes the packet, and the other communication control apparatuses discard the packet. Therefore, the user database 57 is essential as it stores data for determining which communication control apparatus is used to process the packet, and hence, packets cannot be processed without the user database 57. Accordingly, the user database 57 is retained in common in all the communication control apparatuses. The user database 57 corresponds to the primary database in the present invention.

In the present embodiment, each of the communication control apparatuses 10 a, 10 b, 10 c, etc. stores the user database 57 containing data of all users. Each of the communication control apparatuses is notified by the operation monitoring server 110 of the range of user IDs assigned to users whom the communication control apparatus should handle. Each of the apparatuses then refers to the data of user IDs within the notified range in the user database 57 to authenticate a user, and determines whether or not to process a packet that the apparatus has received.

FIG. 19 shows an example of internal data of a management table 111 provided in the operation monitoring server 110. The management table 111 includes apparatus ID fields 112, operating state fields 113 and user ID fields 114. The apparatus ID fields 112 contain the apparatus IDs of the communication control apparatuses 10 a, 10 b, etc. The operating state fields 113 contain the operating state of the communication control apparatuses, and the user ID fields 114 contain the ranges of user IDs handled by the communication control apparatuses. The operating state appears as “operating”, “standby”, “failure”, “data updating”, etc. The operating state fields 113 are updated by the operation monitoring server 110 each time the operating state of the communication control apparatuses 10 a, 10 b, etc. changes. In the example shown in FIG. 19, “465183” users are using the communication control system 100, so that the five communication control apparatuses 10 having the apparatus IDs “1”-“5” are in operation while the communication control apparatus 10 having the apparatus ID “6” is in a standby state.
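The management table 111 may be modelled, for illustration only, as a list of records with the three fields described above. The class name `Entry`, the helper `handler_for` and the exact user ID range assigned to the apparatus ID “5” are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional

CAPACITY = 100_000   # users per apparatus, as assumed in the embodiment


@dataclass
class Entry:
    apparatus_id: int
    state: str                   # "operating", "standby", "failure" or "data updating"
    user_ids: Optional[range]    # user IDs handled; None when no users are assigned


# Apparatus IDs 1-5 in operation, apparatus ID 6 on standby.
TABLE = [Entry(i, "operating", range((i - 1) * CAPACITY + 1, i * CAPACITY + 1))
         for i in range(1, 6)]
TABLE.append(Entry(6, "standby", None))


def handler_for(user_id: int) -> Optional[int]:
    """Return the ID of the operating apparatus that handles `user_id`, if any."""
    for entry in TABLE:
        if (entry.state == "operating" and entry.user_ids is not None
                and user_id in entry.user_ids):
            return entry.apparatus_id
    return None
```

A user with the user ID 150000, for example, falls within the range “100001-200000” and is therefore handled by the apparatus with the apparatus ID “2”.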

The operation monitoring server 110 monitors the operating state of multiple communication control apparatuses 10. When detecting any of the communication control apparatuses 10 being inoperable because of some trouble, the operation monitoring server 110 stores, in the communication control apparatus 10 on standby, the same data as stored in the inoperable apparatus, and places the standby communication control apparatus 10 in operation. For example, when the communication control apparatus 10 with the apparatus ID “2” halts the operation because of a failure, as shown in FIG. 20, the communication control apparatus 10 with the apparatus ID “6”, which has been on standby, stores the data of user IDs “100001-200000” and starts operating. Thus, even if a communication control apparatus 10 stops because of some trouble, the main operation will be continued properly. The communication control apparatus 10 on standby may store any of the data in advance so as to be in a hot standby state, or may be in a cold standby state.
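The switchover to the standby apparatus may be sketched as an update of the management table. The dictionary layout and the name `fail_over` are hypothetical, and the actual transfer of data into the standby apparatus is represented only by the reassignment of the user ID range.

```python
def fail_over(table: list[dict], failed_id: int) -> int:
    """Hand the failed apparatus's user ID range to a standby apparatus."""
    failed = next(e for e in table if e["id"] == failed_id)
    spare = next((e for e in table if e["state"] == "standby"), None)
    if spare is None:
        raise RuntimeError("no standby apparatus available")
    spare["users"] = failed["users"]    # standby stores the same data as the failed unit
    spare["state"] = "operating"
    failed["state"] = "failure"
    return spare["id"]


# Abbreviated table corresponding to the situation before the failure
table = [
    {"id": 1, "state": "operating", "users": "000001-100000"},
    {"id": 2, "state": "operating", "users": "100001-200000"},
    {"id": 6, "state": "standby", "users": None},
]
new_id = fail_over(table, 2)
```

After the call, the apparatus with the apparatus ID “6” is in operation for the user IDs “100001-200000” and the apparatus with the apparatus ID “2” is marked as failed.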

In the present embodiment, the data retaining unit of the communication control apparatus 10 on standby stores the user database 57 in advance. Accordingly, even if a communication control apparatus 10 becomes inoperable, the communication control apparatus 10 on standby can be placed in operation promptly. As mentioned previously, each of the communication control apparatuses 10 determines whether or not to process a packet using user IDs. Therefore, if a communication control apparatus 10 becomes inoperable, and if the system then receives a packet from a user handled by that communication control apparatus 10, there will be no communication control apparatus 10 for processing the packet, and hence, the packet will remain unprocessed. In order to recover from such situation as quickly as possible, the data retaining unit of the communication control apparatus 10 on standby stores the user database 57 in advance, and the operation monitoring server 110 notifies the standby apparatus of the range of user IDs handled by the inoperable communication control apparatus 10 so that the standby communication control apparatus 10 can handle the users instead. Consequently, the communication control apparatus 10 on standby can be placed in operation promptly, so that the chance that a packet remains unprocessed can be minimized.

If the communication control apparatus 10 on standby stores all the databases including the whitelist 162 and blacklist 163 before the apparatus is placed in operation, the situation of a packet remaining unprocessed could continue for a long time because the storing of the databases requires time. Therefore, the communication control apparatus 10 on standby may be placed in operation when only the user database 57 is stored therein. Although this cannot provide the complete URL filtering service, the situation of packets remaining unprocessed can be avoided. The databases that have not yet been stored may be stored during maintenance or database updating, which will be described later. The databases that are used in common, such as the virus list 161 and common category list 164, may also be stored in the communication control apparatus 10 on standby in advance. Accordingly, when the standby apparatus is placed in operation, part of the service such as denying access to URLs contained in the virus list 161 can be provided.

Next, the procedure for updating databases stored in the communication control apparatuses 10 will be described. The database server 150 acquires the latest database from the URL database 160 at a certain time and retains it therein. The database server 150 also updates the user database upon registration of a new user or withdrawal of a user registration and retains it therein. In order to reflect, in a communication control apparatus 10, the latest database retained in the database server 150, the operation monitoring server 110 transfers the data from the database server 150 and stores it in the communication control apparatus 10 at a certain time.

FIGS. 21A, 21B and 21C are diagrams for describing the procedure for updating databases. As with FIG. 19, FIG. 21A shows that the communication control apparatuses 10 with the apparatus IDs “1”-“5” are in operation while the communication control apparatus 10 with the apparatus ID “6” is on standby. At the time when a database is to be updated, the operation monitoring server 110 identifies the communication control apparatus 10 that is in a standby state at that time and instructs the database server 150 to store the data in the communication control apparatus 10. In the example shown in FIG. 21A, the communication control apparatus 10 with the apparatus ID “6” is on standby, so that the database server 150 stores the data in that apparatus. The operation monitoring server 110 then changes the operating state field 113 for the apparatus ID “6” to “data updating”.

FIG. 21B shows a state where a database of a communication control apparatus 10 is being updated. The database server 150 stores, in the user database 57 in the communication control apparatus 10 with the apparatus ID “6” on standby, the data of users handled by one of the communication control apparatuses 10 in operation. The data of the virus list 161, whitelist 162, blacklist 163, common category list 164 and second database 60 are also stored therein. In the example shown in FIG. 21B, the data of users with the user IDs “000001-100000”, which have been handled by the communication control apparatus 10 with the apparatus ID “1”, are stored in the communication control apparatus 10 with the apparatus ID “6”.

FIG. 21C shows a state where the communication control apparatus 10 with the apparatus ID “6” has had its database updated and is placed in operation, and the communication control apparatus 10 with the apparatus ID “1” is placed into a standby state instead. Upon completion of storing data in the communication control apparatus 10 with the apparatus ID “6”, the operation monitoring server 110 starts the operation of the apparatus, which stores the updated database. The operation monitoring server 110 also stops the operation of the communication control apparatus 10 with the apparatus ID “1”, which stores the database before update, to place the apparatus into a standby state. Thus, the communication control apparatus 10 with an updated database is placed in operation. Then, the data of users with the user IDs “100001-200000” are stored in the communication control apparatus 10 with the apparatus ID “1” before the apparatus is placed in operation, and, subsequently, the operation of the communication control apparatus 10 with the apparatus ID “2” is stopped. Thereafter, databases are similarly updated in turn, so that the databases of all the communication control apparatuses 10 can be updated in the background of the actual operation, without halting the operation of the communication control system 100.
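The update by turns may be sketched as a loop in which the apparatus currently on standby receives the updated data for one user ID range, is placed in operation, and the apparatus it replaces becomes the next standby. The function name `rolling_update` and the data layout are assumptions; the user ID ranges for the apparatus IDs “3” to “5” are likewise illustrative.

```python
def rolling_update(assignments: dict[int, str], standby: int):
    """Update the databases one apparatus at a time.

    `assignments` maps the ID of each operating apparatus to its user ID range
    and is modified in place. Returns the list of steps as
    (apparatus placed in operation, apparatus placed on standby, user ID range)
    together with the ID of the apparatus left on standby at the end.
    """
    steps = []
    for old in sorted(assignments):          # snapshot of the IDs operating at the start
        users = assignments.pop(old)         # old apparatus stops after the handover
        assignments[standby] = users         # standby now holds the updated data
        steps.append((standby, old, users))
        standby = old                        # the stopped apparatus is updated next
    return steps, standby


assignments = {
    1: "000001-100000", 2: "100001-200000", 3: "200001-300000",
    4: "300001-400000", 5: "400001-500000",
}
steps, final_standby = rolling_update(assignments, standby=6)
```

At every step exactly five apparatuses are in operation, so every user ID range remains covered while the databases are being replaced.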

In this way, data stored in each of the communication control apparatuses 10 is not fixed in the present embodiment, and hence, the communication control apparatus 10 that stores data of a certain user changes with time. If, before a packet is sent to each of the communication control apparatuses 10, the process of determining which communication control apparatus 10 stores the data of the user is performed, additional time will be required for that process. Accordingly, in the present embodiment, a received packet is provided to all the communication control apparatuses 10, and each of the apparatuses then determines whether or not it stores the data of the user who has sent the packet. Thereafter, only the communication control apparatus 10 that stores the data processes the packet, and the other communication control apparatuses 10 not having the data disregard the packet. In the following, a technique for providing such a mechanism will be described.

FIG. 22 shows a configuration of a communication path control apparatus provided to process packets with multiple communication control apparatuses 10. A communication path control apparatus 200 comprises a switch 210, an optical splitter 220, which is an example of a data supply unit, and a switch 230. The switch 210 transmits a received packet to the communication control apparatuses 10. Between the switch 210 and the communication control apparatuses 10, there is provided the optical splitter 220 that provides the packet to the multiple communication control apparatuses 10 a, 10 b and 10 c in parallel. In practice, the switch 210 transmits a packet to the optical splitter 220, which transmits the packet to each of the communication control apparatuses in parallel.

If a packet is converted to a broadcast packet so as to be transmitted to the multiple communication control apparatuses 10 a, 10 b and 10 c, additional processing such as adding a time stamp to the header will be required, which reduces the processing speed. Therefore, a packet is not converted but split by the optical splitter 220 so as to be transmitted as a unicast packet to the multiple communication control apparatuses 10 a, 10 b and 10 c. This method will be called “parallelcast” in the present specification.

Each of the communication control apparatuses is not set to a mode in which an apparatus receives only packets directed to the MAC address of the apparatus, but set to promiscuous mode in which an apparatus receives all packets regardless of the destination MAC addresses. When receiving a packet sent via parallelcast from the optical splitter 220, each of the communication control apparatuses omits MAC address matching and acquires every packet. Each of the apparatuses then refers to the user database 57 stored in the data retaining unit to perform user ID matching as described in the base technology, and determines whether or not the apparatus should process the packet. In the example shown in FIG. 22, since the data of a user who has sent a packet is stored in the communication control apparatus 10 c, the communication control apparatuses 10 a and 10 b discard the packet while the communication control apparatus 10 c performs the URL filtering as described previously.
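The decision made by each apparatus on a parallelcast packet can be illustrated with a small software model. This is a sketch under stated assumptions, not the hardware of the embodiment: the names `Packet`, `ControlApparatus`, `receive` and `parallelcast`, the MAC addresses, and the representation of the user database 57 as a set of user IDs are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    dst_mac: str   # destination MAC address carried in the header
    user_id: int   # user ID carried in the data portion
    url: str


class ControlApparatus:
    """Hypothetical model of one communication control apparatus."""

    def __init__(self, name, mac, user_ids):
        self.name = name
        self.mac = mac
        self.user_db = set(user_ids)  # stand-in for the user database 57

    def receive(self, packet):
        # Promiscuous mode: the destination MAC address is never compared
        # with self.mac, so every packet is acquired.
        if packet.user_id not in self.user_db:
            return None  # not this apparatus's user: discard the packet
        return f"{self.name} filters {packet.url}"


def parallelcast(packet, apparatuses):
    # The same, unmodified packet is handed to every apparatus, as the
    # optical splitter does; only apparatuses holding the user respond.
    results = [a.receive(packet) for a in apparatuses]
    return [r for r in results if r is not None]


apparatuses = [
    ControlApparatus("10a", "00:00:00:00:00:0a", range(1, 100001)),
    ControlApparatus("10b", "00:00:00:00:00:0b", range(100001, 200001)),
    ControlApparatus("10c", "00:00:00:00:00:0c", range(200001, 300001)),
]
packet = Packet(dst_mac="00:00:00:00:00:ff", user_id=250000,
                url="http://example.com/")
print(parallelcast(packet, apparatuses))
```

Because no apparatus is addressed in advance, moving a user's data to another apparatus changes only the contents of the sets in this model; the sender side needs no update.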

If the packet needs to be returned to the user because, for example, the access has been prohibited, the communication control apparatus 10 c will transmit a response packet to the switch 210, bypassing the optical splitter 220. If the communication control apparatus 10 c processes the packet and the access thereto is permitted, the communication control apparatus 10 c will transmit the packet to the destination of the request for the content. Between the communication control apparatuses 10 and the upstream communication line, there is provided the switch 230 by which packets transmitted from the multiple communication control apparatuses 10 a, 10 b and 10 c are aggregated. In practice, the communication control apparatus 10 c transmits the packet to the switch 230, which transmits the packet to the upstream communication line.

When the switch 230 receives a packet transmitted in return from the destination of a request for a content, since this packet need not be processed by the communication control apparatuses 10, the packet is transmitted from the port 232 of the switch 230 to the port 212 of the switch 210. Thereafter, the packet is transmitted from the switch 210 to the user. On the Internet, the transmission path is generally recorded in the packet to ensure the return path through which a response packet sent in return for the packet can be reliably delivered to the transmission source. In the present embodiment, however, since the return path is already provided within the communication path control apparatus 200, communication can be performed between apparatuses without recording the path or processing the packet. Consequently, unnecessary processing can be eliminated, thereby improving the processing speed.
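The three packet paths through the communication path control apparatus 200 described above can be summarized in a short sketch. The function `packet_path` and its parameters are hypothetical names introduced only to tabulate the paths; the hop labels follow the reference numerals used in the description.

```python
def packet_path(origin, permitted=None):
    """Return the hops a packet takes, as a list of labels.

    origin    -- "user" for a request, "server" for a returned content packet
    permitted -- for requests only: whether the access was permitted
    """
    if origin == "user":
        # Every request is split to all apparatuses before being judged.
        path = ["switch 210", "optical splitter 220",
                "communication control apparatus 10"]
        if permitted:
            # Permitted requests are aggregated and sent upstream.
            path += ["switch 230", "upstream communication line"]
        else:
            # Responses to the user bypass the optical splitter.
            path += ["switch 210", "user"]
        return path
    if origin == "server":
        # Returned packets skip the communication control apparatuses.
        return ["switch 230 (port 232)", "switch 210 (port 212)", "user"]
    raise ValueError(f"unknown origin: {origin!r}")


for case in (("user", True), ("user", False), ("server", None)):
    print(case, "->", " -> ".join(packet_path(*case)))
```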

The example in FIG. 22 shows the case where, when a user has sent a packet including a request for acquiring a content, only the packet sent from the user to the server that retains the content is processed, while a packet transmitted from the server to the user is made to pass through without being processed. Alternatively, the communication path control apparatus 200 may be configured so that the communication control apparatuses 10 process packets transmitted in both directions. In such a case, the optical splitters 220 may be provided on both sides of the communication control apparatuses 10. Also, the bypass path from the switch 230 to the switch 210 need not be provided.

In this way, by sending a packet via parallelcast to all the communication control apparatuses, the packet can be appropriately processed by the proper communication control apparatus among the multiple communication control apparatuses, without the need to specify, in advance, a communication control apparatus by which the packet is to be processed.

The present invention has been described with reference to the embodiment. The embodiment is intended to be illustrative only and it will be obvious to those skilled in the art that various modifications to constituting elements or processes could be developed and that such modifications also fall within the scope of the present invention.

The embodiment describes a communication control system in which each of multiple communication control apparatuses has a data processing unit and a data retaining unit. However, the technique of the present invention is equally applicable to the case where a single data processing apparatus has multiple data retaining units. The technique is also applicable to the case where a single data processing unit refers to databases stored in multiple data retaining units to process data. With regard to the data retaining units on standby, two or more units may be provided.

INDUSTRIAL APPLICABILITY

The present invention is applicable to a data processing system that includes multiple databases.

1. A data processing system, comprising: a data processing unit which processes data acquired; and a plurality of data retaining units which store databases used to process the data, wherein: each of the plurality of data retaining units stores a primary database in common and stores the respective shares of a secondary database; and the data processing system further comprises at least one more data retaining unit which can store the primary database and the respective shares of the secondary database.
2. The data processing system of claim 1, wherein the primary database contains data for determining which share of the secondary database is to be used to process the data.
3. The data processing system of claim 1, further comprising an operation management unit which manages the operating state of the plurality of data retaining units, wherein: the operation management unit operates as many as the number of data retaining units required to share and store the secondary database, and places the other data retaining unit on standby; and, when a database retained in the data retaining units is updated, the operation management unit stores, in a data retaining unit on standby, updated data of the database retained in any one of the data retaining units in operation, and subsequently stops the operation of the data retaining unit storing the database before update and places the data retaining unit storing the updated database in operation.
4. The data processing system of claim 3, wherein, when detecting a data retaining unit in operation being inoperable, the operation management unit stores the database retained by the data retaining unit in a data retaining unit on standby, and places the data retaining unit on standby in operation.
5. The data processing system of claim 3, wherein the data retaining unit on standby stores the primary database in advance.
6. The data processing system of claim 3, wherein: a plurality of the data processing units are provided so as to correspond to the plurality of data retaining units respectively; and the data processing system further comprises a data supply unit which provides acquired data to the plurality of data processing units in parallel.
7. The data processing system of claim 6, wherein the data supply unit provides acquired data as it is to the plurality of data processing units in parallel without processing the data.
8. The data processing system of claim 6, wherein, upon acquisition of data from the data supply unit, each of the plurality of data processing units refers to a database retained in the corresponding data retaining unit so as to determine whether or not to process the data.
9. The data processing system of claim 8, wherein the data processing units are communication control apparatuses which acquire packets to control communications, and, upon acquisition of a packet from the data supply unit, each of the data processing units acquires the packet without determining whether the packet is directed to the data processing unit itself, and refers to a database retained in the corresponding data retaining unit so as to determine whether or not to process the packet.
10. The data processing system of claim 9, wherein each of the communication control apparatuses uses information stored in the data portion of the packet instead of information stored in the header portion thereof to determine whether or not to process the packet.
11. The data processing system of claim 9, wherein each of the communication control apparatuses refers to the primary database retained in the corresponding data retaining unit so as to determine whether or not to process the packet.
12. The data processing system of claim 9, wherein a data processing unit that has determined to process the packet processes the packet, while a data processing unit that has determined not to process the packet discards the packet.
13. The data processing system of claim 9, wherein the data supply unit provides to the plurality of communication control apparatuses in parallel an acquired packet as a unicast packet without converting the packet to a broadcast packet.
14. The data processing system of claim 2, further comprising an operation management unit which manages the operating state of the plurality of data retaining units, wherein: the operation management unit operates as many as the number of data retaining units required to share and store the secondary database, and places the other data retaining unit on standby; and, when a database retained in the data retaining units is updated, the operation management unit stores, in a data retaining unit on standby, updated data of the database retained in any one of the data retaining units in operation, and subsequently stops the operation of the data retaining unit storing the database before update and places the data retaining unit storing the updated database in operation.
15. The data processing system of claim 4, wherein the data retaining unit on standby stores the primary database in advance.
16. The data processing system of claim 4, wherein: a plurality of the data processing units are provided so as to correspond to the plurality of data retaining units respectively; and the data processing system further comprises a data supply unit which provides acquired data to the plurality of data processing units in parallel.
17. The data processing system of claim 5, wherein: a plurality of the data processing units are provided so as to correspond to the plurality of data retaining units respectively; and the data processing system further comprises a data supply unit which provides acquired data to the plurality of data processing units in parallel.
18. The data processing system of claim 7, wherein, upon acquisition of data from the data supply unit, each of the plurality of data processing units refers to a database retained in the corresponding data retaining unit so as to determine whether or not to process the data.
19. The data processing system of claim 10, wherein each of the communication control apparatuses refers to the primary database retained in the corresponding data retaining unit so as to determine whether or not to process the packet.
20. The data processing system of claim 10, wherein a data processing unit that has determined to process the packet processes the packet, while a data processing unit that has determined not to process the packet discards the packet.